fix(db): fix number of files in db, startup hang, ram issues and flushing issues #379

cchudant · 2024-11-06T16:00:39Z

Pull Request type

Bugfix

What is the current behavior?

Pragma devnet has >1M sst files. This is because compaction couldn't keep up with the atomic flushes we do after every block. As a result, db startup took an absurd amount of time (sometimes tens of minutes, scanning most of those >1M files). The flushes themselves were also taking a long time (~10s!).

What is the new behavior?

I could replicate the issue by copying the pragma db locally, but I could also replicate it by syncing a node with a modified version of block-import that force-flushes every block (this mimics the behavior of the sequencer, which does the same) and watching the number of files in the db. This was much easier.
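
Watching the file count during such a sync needs only a few lines of std-only Rust. This is a minimal sketch and not part of the PR: `count_sst_files` is a hypothetical helper, and it assumes the table files sit directly in the database directory with a `.sst` extension.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Counts the regular files directly inside `db_dir` whose extension is `sst`.
/// Hypothetical helper: assumes a flat database directory (no sub-directories
/// holding table files).
fn count_sst_files(db_dir: &Path) -> io::Result<usize> {
    let mut count = 0;
    for entry in fs::read_dir(db_dir)? {
        let path = entry?.path();
        // Only count regular files named `*.sst`; logs and manifests are skipped.
        let is_sst = path.extension().and_then(|ext| ext.to_str()) == Some("sst");
        if path.is_file() && is_sst {
            count += 1;
        }
    }
    Ok(count)
}
```

Calling this on the node's db path every few hundred blocks is enough to see whether the count keeps growing or levels off.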

The tiered compaction is not suited to our needs, so I switched the db to universal compaction. With this compaction mode, I managed to sync 100k blocks with only about a hundred sst files in the db instead of around 30 thousand. Hopefully, this should fix all of these issues, and the RAM taken by rocksdb should now be bounded.

Does this introduce a breaking change?

Theoretically, no - but I think you should upgrade your sequencer db by syncing a full-node on it and replacing the sequencer's db with the full-node's, to benefit from all of the new db options. Using an old db will work, but I don't think it will rewrite the old sst files very much.

antiyro

LGTM, if you could just add some comments on the parameters currently used.

antiyro · 2024-11-07T08:59:16Z

crates/client/db/src/rocksdb_options.rs

+    options.set_max_subcompactions(cores as _);
+
+    options.set_max_log_file_size(10 * MiB);
+    options.set_max_open_files(2048);


could you add a comment describing each of those numbers?

Yes, or at least a doc for the function explaining how it addresses the issue this PR is solving.

The max open files setting was added in #367.
I'll add info to the options. I may also solve #231 while I'm at it, if it doesn't take me too long.
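
As an illustration of the per-number comments requested above, here is a minimal std-only sketch that mirrors the three settings from the diff as plain data. The struct `GlobalTuning`, the function `global_tuning`, the field types, the definition of `MiB`, and the use of `std::thread::available_parallelism` to obtain `cores` are all assumptions. The comments describe what each RocksDB option generally controls, not the rationale behind the specific values in this PR.

```rust
use std::thread;

// Binary size units, named like the `MiB` used in the diff
// (this definition is an assumption).
#[allow(non_upper_case_globals)]
const KiB: usize = 1024;
#[allow(non_upper_case_globals)]
const MiB: usize = 1024 * KiB;

/// Hypothetical plain-data mirror of the three settings in the diff.
#[derive(Debug, PartialEq)]
struct GlobalTuning {
    /// Upper bound on how many threads a single compaction job may be split
    /// across. The diff sets it to the number of cores.
    max_subcompactions: u32,
    /// Size at which RocksDB rotates its informational LOG file
    /// (this is not the write-ahead log).
    max_log_file_size: usize,
    /// Cap on the number of files RocksDB keeps open at once; the rest are
    /// reopened on demand.
    max_open_files: i32,
}

/// Builds the values used in the diff. `cores` falls back to 1 if the
/// parallelism of the machine cannot be queried.
fn global_tuning() -> GlobalTuning {
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    GlobalTuning {
        max_subcompactions: cores as u32,
        max_log_file_size: 10 * MiB,
        max_open_files: 2048,
    }
}
```

In the real code these values would be passed to the corresponding setters on the RocksDB options object shown in the diff.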

antiyro · 2024-11-07T09:00:02Z

crates/client/db/src/rocksdb_options.rs

+        options.set_compression_type(DBCompressionType::Zstd);
+        match self {
+            Column::BlockNToBlockInfo | Column::BlockNToBlockInner => {
+                options.optimize_universal_style_compaction(1 * GiB);


Same here, why 1 GiB?

I can't access the column size metrics as of now, so I eyeballed it.
We can tune all of this later. The basic idea is that this arg is the RAM budget of the column - I like to think that a normal machine should run madara in 4 GB of RAM max, and base these tuning params around that.

I have not run any tests to see if these numbers are optimal; they just looked okay to me.
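
The budgeting idea can be sanity-checked with simple arithmetic. The sketch below is std-only and hypothetical: `check_budgets`, `TOTAL_RAM_TARGET`, and the definitions of `MiB`/`GiB` are assumptions. It sums per-column budgets and compares them against the roughly 4 GB target mentioned above, treated here as 4 GiB.

```rust
// Binary size units, named like the ones used in the diff
// (these definitions are assumptions).
#[allow(non_upper_case_globals)]
const MiB: u64 = 1024 * 1024;
#[allow(non_upper_case_globals)]
const GiB: u64 = 1024 * MiB;

/// Overall RAM target from the discussion, taken here as 4 GiB.
const TOTAL_RAM_TARGET: u64 = 4 * GiB;

/// Sums the per-column budgets (in bytes).
/// Returns `Ok(total)` if the total fits under `cap`,
/// otherwise `Err(overshoot)` with the number of bytes over the cap.
fn check_budgets(budgets: &[(&str, u64)], cap: u64) -> Result<u64, u64> {
    let total: u64 = budgets.iter().map(|(_, bytes)| *bytes).sum();
    if total <= cap {
        Ok(total)
    } else {
        Err(total - cap)
    }
}
```

For example, the two block columns from the diff at 1 GiB each use 2 GiB of a 4 GiB target, which leaves the other half for the remaining columns and the rest of the node.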

shamsasari · 2024-11-07T10:11:30Z

crates/client/db/src/rocksdb_options.rs

+    options.set_max_subcompactions(cores as _);
+
+    options.set_max_log_file_size(10 * MiB);
+    options.set_max_open_files(2048);


Yes, or at least a doc for the function explaining how it addresses the issue this PR is solving.

shamsasari · 2024-11-20T11:19:17Z

Is this good to merge? I will need to deal with the merge conflicts in #388.

fix(db): fix number of files in db, startup hang, ram issues and flus…

68b2aa7

…hing issues

antiyro approved these changes Nov 7, 2024

shamsasari suggested changes Nov 7, 2024

Trantorian1 approved these changes Nov 18, 2024

cchudant self-assigned this Nov 19, 2024

Merge branch 'main' into fix-rocksdb-options

75e4418

shamsasari approved these changes Nov 20, 2024

antiyro merged commit 018b6cc into main Nov 20, 2024
10 checks passed

cchudant mentioned this pull request Nov 20, 2024

dev docs: add documentation to the rocksdb configuration #392

Open
